INTRODUCTION


This report presents the theory underlying a possible solution, via
adaptive noise canceling, to a cosite interference problem encountered
by co-located frequency hopping radios. When two or more such radios
and their antennas are independently operated in close proximity, e.g.,
in a jeep or communication shelter, a cosite interference problem can
develop. In this type of situation, a radio may not be able to meet its
specified bit error rate. A degraded bit error rate means that the
radio receiver's sensitivity is degraded, which results in a decreased
communications range.

This type of interference problem arises when a transmitter's strong
signal is too close in frequency to the desired, weaker signal that a
co-located receiver is trying to receive. The difference in power
levels between the strong interfering transmitter signal at the
receiver input and the minimum signal the receiver is capable of
detecting could be in excess of 130 dB. For more details on a typical
cosite scenario (signal and interfering power levels, frequency
separation, required suppression, etc.) see Reference 18.
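To give a feel for the 130 dB figure quoted above, the short sketch below converts a decibel value into the equivalent power and amplitude ratios. The function names are illustrative and do not come from the report.

```python
import math


def db_to_power_ratio(db: float) -> float:
    """Power ratio corresponding to a level difference in dB."""
    return 10.0 ** (db / 10.0)


def db_to_amplitude_ratio(db: float) -> float:
    """Amplitude (voltage) ratio corresponding to the same dB value."""
    return 10.0 ** (db / 20.0)


# 130 dB between the interferer and the weakest detectable signal.
power_ratio = db_to_power_ratio(130.0)
amplitude_ratio = db_to_amplitude_ratio(130.0)
print(f"power ratio {power_ratio:.3g}, amplitude ratio {amplitude_ratio:.3g}")
```

A 130 dB difference therefore corresponds to a power ratio of 10^13, or an amplitude ratio of roughly three million to one.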

The receiver may not be able to provide the entire 130 dB of
interference rejection filtering needed at the transmitter frequency.
Therefore, an external applique capable of supplying the additional
filtering may be required. An Adaptive Noise Canceler with a single
input is one possible way of providing the additional filtering
required.

Adaptive noise cancelers are not limited to separating narrow-band
signals that are close in frequency, i.e., they are not limited in
application to just frequency hopping radios. A particular type of
adaptive noise canceler known as an Adaptive Line Enhancer (ALE) is
capable of separating narrow-band, deterministic signals from random
wide-band signals (e.g., it is capable of protecting a weak wide-band,
direct sequence spread spectrum signal from a strong, interfering,
narrow-band signal).

Initially, the theoretical steady-state performance of both an adaptive
noise canceler with a single input and an adaptive line enhancer will
be described by assuming that the adaptive process has "converged"
(i.e., the filter tap weights are no longer changing). These adaptive
filters can then be approximated by and understood as Wiener filters.

A Wiener filter is essentially a transversal filter that produces an
optimum output in a minimum mean square sense. A Wiener filter is shown
in Figure 1. The output of a transversal filter is subtracted from a
"desired" response, d, that is similar to but not exactly the same as
the signal to be detected. The Wiener weights of the transversal filter
are designed to minimize the mean square error

$$E\left[\left(d - \sum_{i=0}^{n} W_i X_{k-i}\right)^2\right]$$

at the output of the summer. When the Wiener weights are used, the
transversal filter gives an optimum or best estimate of the true signal
value (the signal that d, the desired response, is similar to).

In an effort to explain how an adaptive noise canceler with a single
input and an ALE actually work, the functional relationship between the
optimal or Wiener PTF weight values (and hence the PTF frequency
response) and the interfering and intended signals is developed in much
more detail than is found in textbooks or review articles. Building on
this analytical foundation, it is then shown why:

1. For the case of a weak narrow-band intended signal versus a strong
narrow-band interferer, the frequency response of the PTF within an
adaptive noise canceler with a single input is dominated or controlled
by the strong interfering signal. This results in a PTF passband and an
adaptive noise canceler notch around the interfering frequency.

2. For the case of either a weak random wide-band intended signal
versus a strong narrow-band interferer or the case of a weak
narrow-band intended signal versus a strong random wide-band
interferer, the frequency response of the PTF in an ALE is determined
by the narrow-band signal. This results in a PTF passband around the
narrow-band frequency and a notch in the ALE output at this same
narrow-band frequency.

After the steady-state performance of the subject adaptive filters has
been described, three different adaptive algorithms (Differential
Steepest Descent, Least Mean Square, and Random Search) are introduced.
These algorithms describe how the adaptive filter tap weights must be
iteratively modified in order to approach a "steady-state" condition.

Finally, a SAW device implementation of a PTF that could be used in
building an adaptive noise canceler with a single input or an ALE is
described. Performance levels (maximum input power, interference
suppression, and switching speed) are given in order to illustrate its
capabilities.

ADAPTIVE NOISE CANCELING

An Adaptive Noise Canceler as shown in Figure 2 works as follows:

"A signal is transmitted over a channel to a sensor that receives the
signal plus an uncorrelated noise N0. The combined signal and noise
S + N0 form the primary input to the canceler. A second sensor receives
a noise N1, which is uncorrelated with the signal but correlated in
some unknown way with the noise N0. This sensor provides the reference
input to the canceler. The noise N1 is filtered to produce an output Y
that is a close replica of N0. This output is subtracted from the
primary input S + N0 to produce the system output, S + N0 - Y."
(Reference 1)

The output of the canceler is used to modify, via an appropriate
adaptive algorithm, the frequency response of the adaptive filter.
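The feedback loop just described can be sketched with the Least Mean Square (Widrow) update, one of the algorithms named in the Introduction. This is a minimal illustration under stated assumptions, not the report's implementation: the function name, step size `mu`, tap count, the two-tap "unknown channel" relating N1 to N0, and the test signals are all hypothetical.

```python
import numpy as np


def lms_noise_canceler(primary, reference, num_taps=4, mu=0.005):
    """Two-input adaptive noise canceler driven by the LMS weight update."""
    w = np.zeros(num_taps)
    out = np.zeros(len(primary))
    for k in range(num_taps - 1, len(primary)):
        # Tap outputs, newest sample first: [x_k, x_{k-1}, ...]
        x = reference[k - num_taps + 1:k + 1][::-1]
        y = w @ x                     # adaptive filter output Y
        out[k] = primary[k] - y       # system output S + N0 - Y
        w += 2.0 * mu * out[k] * x    # LMS update uses the system output
    return out, w


rng = np.random.default_rng(0)
n = 5000
t = np.arange(n)
s = 0.1 * np.sin(2 * np.pi * 0.05 * t)     # weak intended signal S
n1 = rng.standard_normal(n)                # reference noise N1
n0 = np.convolve(n1, [0.8, -0.4])[:n]      # N0: N1 through an assumed channel
out, w = lms_noise_canceler(s + n0, n1)
print(np.round(w, 2))
```

After convergence the weights approximate the assumed channel, so Y is a close replica of N0 and the output is close to S.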

The adaptive filter will usually be implemented as a programmable
transversal filter (PTF) (see Figure 3). A transversal filter is the
preferred implementation because:

1.  It  is  one  of  the  simplest  filter  structures.  The  filter 
output  is  simply  the  sum  of  delayed  and  scaled  inputs. 

2.  There  is  no  feedback  from  the  taps  to  the  input. 

3. It is stable. Since there is no feedback, a finite filter input
produces a finite filter output.

4. It has a linear phase characteristic (strictly, when its tap weights
are symmetric about the center tap), i.e., it produces a phase shift
that is linearly proportional to frequency. It can be shown that if a
signal is to be passed through a linear system without any resultant
distortion, the overall system frequency response must have a constant
amplitude gain characteristic over the frequency spectrum of the input
signal and its phase shift must be linear over the same frequency
spectrum. Filtering without distortion is important for adaptive noise
canceling because the adaptive filter must pass the interference
without distortion so that it can be subtracted (at the summer) from
the unfiltered interferer. If the adaptive filter introduces
distortion, then the summer is no longer subtracting two identical
interferers.

5. There is a simple and analytically tractable relationship between
the frequency transfer function of a transversal filter and its
parameters (see equation 47). The complicated nonlinear relationship
between parameters and transfer function for most other filter
structures makes the analysis and calculation of adaptive algorithms
much more difficult than for transversal filters.

6. Widrow's algorithm, one of the most widely used adaptive
algorithms, assumes a transversal filter structure.

A  PTF  forms  a  weighted  sum  of  delayed  versions  of  the  input 
signal.  It  is  programmable  in  that  the  weights  can  be  changed. 
Changing  the  weights  changes  the  frequency  transfer  function  of  the 
PTF.  A  PTF  is  identical  in  structure  to  a  programmable  finite 
impulse  response  (FIR)  digital  filter. 
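The weighted-sum structure described above can be written out directly. The sketch below is illustrative only (the function name and sample values are not from the report); it forms the output tap by tap, which matches an ordinary FIR convolution.

```python
import numpy as np


def ptf_output(x, weights):
    """Weighted sum of delayed inputs: y_k = sum_i W_i * x_{k-i}."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    y = np.zeros(len(x))
    for i, w_i in enumerate(weights):
        # Tap i carries the input delayed by i samples (zeros before start).
        delayed = np.concatenate([np.zeros(i), x[:len(x) - i]])
        y += w_i * delayed
    return y


x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([0.5, 0.25])      # "programming" the filter = choosing w
y = ptf_output(x, w)
print(y)
```

Changing `w` changes the impulse response, and therefore the frequency transfer function, without altering the structure.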

The specific technology used to implement a PTF will depend on the
frequency range of interest. For VHF and UHF applications, Surface
Acoustic Wave (SAW) devices are an appropriate technology.

ADAPTIVE NOISE CANCELING WITH A SINGLE INPUT


Before an adaptive noise canceler can be implemented, a reference
signal correlated with the interfering signal but not the intended
signal must be generated. When the interfering signal N0 is much
stronger than the intended signal S, the reference signal can be
generated by modifying the adaptive noise canceler of Figure 2 to give
the circuit shown in Figure 4. In Figure 4 the primary and reference
inputs are connected together. In effect, Figure 4 assumes that the
reference input is equal to the primary input. This may at first appear
contradictory. The reference input has to be correlated to the
interference N0, not the signal S. But since the signal S is part of
the primary input, it will be part of the reference input if the
reference input equals the primary input as per Figure 4. Hence, the
reference input appears to be correlated to the signal also. When the
interfering signal N0 is much larger than the intended signal
(N0 >> S), the apparent contradiction is resolved. In this case the
reference input (N1 = S + N0 = primary input) is highly correlated with
and "looks" like the interfering signal N0 (i.e., N1 ≈ N0).

While S is a component of N1 and therefore will correlate to a certain
extent with N1, N0 is so much larger than S that N1 will be much more
highly correlated to N0 than to S. So to a very good approximation, the
reference input N1 is correlated to the interference N0, not the signal
S. This is demonstrated more explicitly below.

Figure 4. Adaptive Noise Canceler with a Single Input

It will now be shown why the reference input must be correlated to the
interference and not the signal. The adaptive filter within the
canceler must filter the reference input N1 to produce an output Y that
is a close replica of N0. If N1 is not correlated to N0, i.e., if N1
does not "look" somewhat like N0, then no amount of filtering can make
Y look like N0. To prove that the reference input (or primary input) of
Figure 4 is more highly correlated to the interference than to the
signal, first note that the reference input equals


$$\frac{S + N_0}{\sqrt{2}}$$

where: 

S  =  input  signal  amplitude 

N0  =  input  "noise"  or  interference  amplitude 

The factor $1/\sqrt{2}$ appears because the input power splitter is
assumed to evenly split the power associated with the signal and
interference amplitudes S and N0. Since power is proportional to
amplitude squared, reducing power by a factor of 2 means that amplitude
is reduced by a factor of $\sqrt{2}$ at each output of the input power
splitter.

Since we are assuming that N0 is much larger than S, i.e., N0 >> S, it
follows that $(S + N_0)/\sqrt{2}$ is more highly correlated with N0
than with S. To be more explicit, define (Reference 2) the average
cross-correlation $R_{12}(\tau)$ between two waveforms $V_1(t)$ and
$V_2(t)$ as

$$R_{12}(\tau) = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} V_1(t)\, V_2(t+\tau)\, dt \qquad (1)$$

where $\tau$ is the relative time displacement between the two
waveforms $V_1$ and $V_2$. Then the correlation between the reference
input and the noise input is

$$R_{(Ref)(Noise)}(\tau) = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} \frac{S(t) + N_0(t)}{\sqrt{2}}\, N_0(t+\tau)\, dt \qquad (2)$$

The correlation between the reference input and the signal input is

$$R_{(Ref)(Signal)}(\tau) = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} \frac{S(t) + N_0(t)}{\sqrt{2}}\, S(t+\tau)\, dt \qquad (3)$$

Since by assumption N0 >> S, at $\tau = 0$ the dominant term in the
integrand of equation (2) for $R_{(Ref)(Noise)}(0)$ will be
$(N_0(t))^2/\sqrt{2}$, i.e., the limit of the integral can be
approximated by

$$\lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} \frac{(N_0(t))^2}{\sqrt{2}}\, dt \approx R_{(Ref)(Noise)}(0) \qquad (4)$$

In a similar analysis, the dominant term in the integrand of equation
(3) for $R_{(Ref)(Signal)}(0)$ will be $N_0(t) S(t)/\sqrt{2}$. The
limit of the integral can be approximated by

$$\lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} \frac{S(t)\, N_0(t)}{\sqrt{2}}\, dt \approx R_{(Ref)(Signal)}(0) \qquad (5)$$

N0 >> S implies that

$$(N_0(t))^2 \gg N_0(t)\, S(t) \qquad (6)$$

Since $(N_0(t))^2/\sqrt{2}$ is the approximate integrand of
$R_{(Ref)(Noise)}(0)$ and $N_0(t) S(t)/\sqrt{2}$ is the approximate
integrand of $R_{(Ref)(Signal)}(0)$, equations (4) and (5) and
inequality (6) imply that

$$R_{(Ref)(Noise)}(0) \gg R_{(Ref)(Signal)}(0) \qquad (7)$$

In other words, inequality (7) indicates that the reference signal is
much more highly correlated with the noise than with the signal, as was
to be demonstrated. This means that the reference signal "looks" more
like the interference, N0, than the signal S.
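Inequality (7) can be checked numerically with a finite-length time average standing in for the limit in equation (1). The sketch below is an illustration under assumed test signals (two sinusoids at hypothetical normalized frequencies 0.10 and 0.11 cycles per sample, with N0 one hundred times larger than S); none of these values come from the report.

```python
import numpy as np


def avg_correlation(v1, v2):
    """Finite-sample time average of v1(t) * v2(t), i.e. zero-lag correlation."""
    return float(np.mean(v1 * v2))


t = np.arange(20000)
s = 0.01 * np.sin(2 * np.pi * 0.11 * t)    # weak intended signal S
n0 = 1.00 * np.sin(2 * np.pi * 0.10 * t)   # strong interferer N0
ref = (s + n0) / np.sqrt(2)                # reference input after the splitter

r_ref_noise = avg_correlation(ref, n0)     # left side of inequality (7)
r_ref_signal = avg_correlation(ref, s)     # right side of inequality (7)
print(r_ref_noise, r_ref_signal)
```

For these particular sinusoids the cross term N0(t)S(t) averages to nearly zero, so the reference-to-signal correlation is even smaller than the N0 S bound used in the argument above; the inequality holds either way.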

As the adaptive algorithm iterates, it will cause the adaptive filter
to form a bandpass around the interfering frequency, $F_{N_0}$. If the
PTF has been properly designed, then the resulting bandpass filter will
"pass" $F_{N_0}$, the interfering frequency, and "reject" the intended
signal frequency. Then the output of the adaptive filter (the filtered
reference signal) will "look" even more like $N_0/\sqrt{2}$ than the
input signal does. When this output is subtracted from
$(S + N_0)/\sqrt{2}$ at the summer, a signal very similar to
$S/\sqrt{2}$ will remain. The interference has been canceled. The
circuit shown in Figure 4 does indeed behave as an adaptive noise
canceler.

ADAPTIVE LINE ENHANCER


The discussion up to this point was only concerned with protecting a
narrow-band signal from narrow-band interference. It is also desirable
to be able to separate narrow-band signals from random broad-band
signals. Waveforms encountered in communications systems are in many
cases unpredictable. A random signal is often an appropriate model for
a real signal. The following discussion will deal with separating both:

1. A weak random broad-band signal from a strong narrow-band
interferer, and

2. A weak narrow-band signal from a strong random broad-band
interferer.

An Adaptive Line Enhancer (ALE), illustrated in Figure 5, is one
possible method of performing this signal separation. An adaptive line
enhancer differs from an "Adaptive Noise Canceler with a single input"
as shown in Figure 4 in that a delay has been introduced preceding the
adaptive filter. In order to understand how an ALE works, a more
detailed analysis of Figure 4 will be necessary.

Figure 5. Adaptive Line Enhancer (From Ref 8)


ANALYSIS  OF  AN  ADAPTIVE  NOISE  CANCELER  WITH  A  SINGLE  INPUT 


After the adaptive process has converged, the performance of the filter
in the adaptive noise canceler of Figure 4 can be approximated by a
Wiener filter. This means that after convergence the adaptive algorithm
has produced (by adjusting the adaptive filter frequency response) a
system output, $(S + N_0)/\sqrt{2} - Y$, that is a best fit in a
minimum mean square error sense to $S/\sqrt{2}$. In other words, the
mean square error is minimized, i.e., the average value taken over a
large number of samples of

$$\left(\text{system output} - \frac{\text{intended signal input}}{\sqrt{2}}\right)^2 = \left(\frac{S + N_0}{\sqrt{2}} - Y - \frac{S}{\sqrt{2}}\right)^2$$

is a minimum. Since this expression reduces to
$(N_0/\sqrt{2} - Y)^2$, the adaptive algorithm is in effect minimizing
the interference power at the adaptive noise canceler output by causing
(via tap weight adjustment) the adaptive filter output Y to "look" like
the interference N0.

The adaptive filter frequency response can be controlled by varying its
tap weights. The optimal weight vector W*, the Wiener weight vector,
that minimizes the mean squared system output is given by

$$W^* = R^{-1} P \qquad (8)$$

where R = Input Correlation Matrix
and
P = Cross-Correlation Column Vector

$$R = E \begin{bmatrix}
X_k X_k & X_k X_{k-1} & X_k X_{k-2} & \cdots & X_k X_{k-n} \\
X_{k-1} X_k & X_{k-1} X_{k-1} & X_{k-1} X_{k-2} & \cdots & X_{k-1} X_{k-n} \\
\vdots & \vdots & \vdots & & \vdots \\
X_{k-n} X_k & X_{k-n} X_{k-1} & X_{k-n} X_{k-2} & \cdots & X_{k-n} X_{k-n}
\end{bmatrix} \qquad (9)$$


The symbol E means that the matrix R is composed of the expected or
mean values of the indicated products of adaptive filter tap outputs
(see Figure 3). The main diagonal terms of R are the mean squares of
the tap outputs. The off-diagonal terms are the cross-correlations
among the tap outputs.

$$P = E\left[ d_k X_k,\ d_k X_{k-1},\ d_k X_{k-2},\ \ldots,\ d_k X_{k-n} \right] \qquad (10)$$

where $d_k$ is the desired response at "time" k. When $X_k$ is the
reference input to the adaptive noise canceler of Figure 4, $d_k$ is
the primary input. In terms of Figure 4's notation:

$$d_k = \frac{S + N_0}{\sqrt{2}} \qquad (11)$$

$$X_k = \frac{S + N_0}{\sqrt{2}} \qquad (12)$$


The  components  of  the  vector  P  are  the  cross-correlations 
between  the  desired  response  and  the  adaptive  filter  tap  outputs. 
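Equations 8, 9, and 10 can be exercised numerically by replacing the expected values with sample averages. The sketch below is only an illustration: the helper names are hypothetical, and the desired response is an assumed two-tap filtering of white noise (not the Figure 4 case, where the desired response equals the filter input) so that the recovered weights are easy to recognize.

```python
import numpy as np


def tap_matrix(x, n):
    """Rows are [X_k, X_{k-1}, ..., X_{k-n}] for k = n .. len(x) - 1."""
    return np.column_stack([x[n - i:len(x) - i] for i in range(n + 1)])


def wiener_weights(x, d, n):
    """Sample estimates of R (eq. 9) and P (eq. 10), then W* = R^-1 P (eq. 8)."""
    X = tap_matrix(x, n)
    dk = d[n:]
    R = X.T @ X / len(X)     # mean products of tap outputs
    P = X.T @ dk / len(X)    # mean products of desired response and taps
    return np.linalg.solve(R, P), R, P


rng = np.random.default_rng(1)
x = rng.standard_normal(50000)
# Assumed desired response: 0.5*x_k - 0.3*x_{k-1} plus a little noise.
x_delayed = np.concatenate([[0.0], x[:-1]])
d = 0.5 * x - 0.3 * x_delayed + 0.01 * rng.standard_normal(len(x))
w_star, R, P = wiener_weights(x, d, n=2)
print(np.round(w_star, 3))
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of R explicitly, which is usually the better-conditioned choice.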

Equations 8, 9, and 10 can be used to investigate the influence of the
interferer and the intended signal on the optimal weight vector W*. Of
particular interest are those conditions under which W*, and hence the
frequency response of the adaptive filter, are only a function of the
interfering signal. This is what will allow the adaptive filter to form
a "bandpass" around the interferer and reject the intended signal.

A typical element of the autocorrelation matrix (equation 9) is
$R_{ij} = E[X_{k-i} X_{k-j}]$, where:

$X_k$ = signal input to the adaptive filter at time k or at sample k.

$X_{k-i}$ = total signal at the ith tap of the adaptive filter.

$X_{k-j}$ = total signal at the jth tap of the adaptive filter.

k is a time index, not necessarily a unit of time.

If we assume, as per Figure 4, that the input to the adaptive filter is

$$X_k = \frac{S + N_0}{\sqrt{2}}$$

where:

S = signal

N0 = noise or interference

then

$$E[X_{k-i} X_{k-j}] = \frac{1}{2} E[(S + N_0)_{k-i} (S + N_0)_{k-j}] \qquad (13)$$

$$= \frac{1}{2} E[(S_{k-i} + N_{0,k-i})(S_{k-j} + N_{0,k-j})]$$

$$E[X_{k-i} X_{k-j}] = \frac{1}{2} E[S_{k-i} S_{k-j} + S_{k-i} N_{0,k-j} + N_{0,k-i} S_{k-j} + N_{0,k-i} N_{0,k-j}] \qquad (14)$$

$$E[X_{k-i} X_{k-j}] = \frac{1}{2}\left(E[S_{k-i} S_{k-j}] + E[S_{k-i} N_{0,k-j}] + E[N_{0,k-i} S_{k-j}] + E[N_{0,k-i} N_{0,k-j}]\right) \qquad (15)$$
Interference occurs when the noise is much larger than the intended
signal. We shall therefore assume that:

$$N_0 \gg S \qquad (16)$$

The last term in equation 15, which is a function of the interference
but not the intended signal, will usually be the largest term in the
equation since all other terms are expected values of products
containing S (the intended signal). Clearly if the interference is much
greater than the signal (N0 >> S), then $N_0^2 > N_0 S > S^2$ and in
most cases

$$E[N_{0,k-i} N_{0,k-j}]$$

will be larger than any of

$$E[S_{k-i} S_{k-j}], \quad E[S_{k-i} N_{0,k-j}], \quad \text{or} \quad E[N_{0,k-i} S_{k-j}].$$

It is possible for $N_{0,k-i}$ and $N_{0,k-j}$ to be 90 degrees out of
phase (for narrow-band deterministic interference). In this case,
$E[N_{0,k-i} N_{0,k-j}]$ would not be larger than the other terms in
equation 15 and the sum of all four terms would be of order $N_0 S$,
which is much smaller than $N_0^2$. Every element of the
autocorrelation matrix R is either dominated by the interference N0 or
is small compared to it. If the autocorrelation matrix R is dominated
by the interference, it can be shown that $R^{-1}$ will also be
dominated by the interference.
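The claim that R is dominated by the interference can be illustrated by comparing the sample correlation matrix of the full input with that of the interference alone. The sketch below covers only R itself (not its inverse) and uses assumed test signals: two sinusoids at hypothetical normalized frequencies, with N0 one hundred times larger than S, and a 4-tap filter.

```python
import numpy as np


def correlation_matrix(x, n):
    """Sample estimate of R with elements E[X_{k-i} X_{k-j}], i, j = 0..n."""
    X = np.column_stack([x[n - i:len(x) - i] for i in range(n + 1)])
    return X.T @ X / len(X)


t = np.arange(20000)
s = 0.01 * np.sin(2 * np.pi * 0.11 * t)   # weak intended signal S
n0 = np.sin(2 * np.pi * 0.10 * t)         # strong interferer N0

R_total = correlation_matrix((s + n0) / np.sqrt(2), n=3)
R_noise = correlation_matrix(n0 / np.sqrt(2), n=3)

# Relative size of everything in R that involves S (the first three
# terms of equation 15) compared with the interference-only term.
rel_diff = np.linalg.norm(R_total - R_noise) / np.linalg.norm(R_noise)
print(rel_diff)
```

With these values the signal-dependent terms change R by well under one part in a thousand.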

The Wiener weight vector W* that minimizes the mean square adaptive
noise canceler output is given by equation 8. The preceding analysis
has shown that R and hence $R^{-1}$ are dominated by or are primarily
functions of the interfering signal. If it can be shown that P, the
cross-correlation column vector, is also dominated by the interfering
signal, then equation 8 will imply that the Wiener weight vector W* is
primarily a function of the interference. As mentioned previously, this
primary dependence of W* on the interfering signal is what will allow
the adaptive filter to form a bandpass around the interferer: pass the
interferer and reject the intended signal.

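The cross-correlation vector P will now be investigated.

From equations 10, 11, and 12:

$$P = E\left[ d_k X_k,\ d_k X_{k-1},\ d_k X_{k-2},\ \ldots,\ d_k X_{k-n} \right] \qquad (10)$$

$$d_k = \frac{S_k + N_{0,k}}{\sqrt{2}} \qquad (11)$$

$$X_k = \frac{S_k + N_{0,k}}{\sqrt{2}} \qquad (12)$$

also

$$X_{k-i} = \frac{S_{k-i} + N_{0,k-i}}{\sqrt{2}} \qquad (17)$$

A typical element of P is:

$$P_i = E[d_k X_{k-i}] \qquad (18)$$

where i can vary between 0 and n.

Substituting equations 11 and 17 into equation 18 gives:

$$P_i = E[d_k X_{k-i}] = \frac{1}{2} E[(S_k + N_{0,k})(S_{k-i} + N_{0,k-i})] \qquad (19)$$

$$= \frac{1}{2} E[S_k S_{k-i} + S_k N_{0,k-i} + N_{0,k} S_{k-i} + N_{0,k} N_{0,k-i}]$$

$$P_i = \frac{1}{2}\left(E[S_k S_{k-i}] + E[S_k N_{0,k-i}] + E[N_{0,k} S_{k-i}] + E[N_{0,k} N_{0,k-i}]\right) \qquad (20)$$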
The analysis of equation 20 is now very similar to the analysis of
equation 15. Since it is assumed that N0 >> S, the last term of the
equation,

$$E[N_{0,k} N_{0,k-i}]$$

will usually be the largest term of the equation because all the other
terms are expected values of products containing S. Eventually the same
conclusion will be reached about the cross-correlation vector P that
was arrived at in reference to the autocorrelation matrix R, that is,
every element of P is either dominated by N0 or is small compared to
it.

Since P and $R^{-1}$ are dominated by N0, it follows from equation 8
that W*, the optimal weight vector, will also be dominated by the
interference. The interference "controls" the optimal weights. This is
the conclusion that was to be established.

The  adaptive  noise  canceler  circuit  of  Figure  4  works  when 
both  the  intended  signal  and  the  interferer  are  narrow-band.  When 
the  strong  interfering  input  to  the  circuit  is  a  random  wide-band 
signal,  the  canceler  will  not  be  able  to  filter  it  out.  The  PTF 
will  not  be  able  to  reject  the  weak  narrow-band  intended  signal  and 
pass  the  random  wide-band  interferer  (assuming  they  overlap  in 
frequency)  as  was  done  in  the  narrow-band  interferer  vs.  narrow- 
band  intended  signal  case  previously  discussed. 

If the PTF could put a passband around the narrow-band intended signal
and filter out most of the strong random wide-band interferer, then
signal separation could be achieved. In effect, this means that the
narrow-band intended signal would control the PTF frequency response,
as opposed to the narrow-band interferer vs. narrow-band intended
signal case, where the interference dominated and controlled the PTF
frequency response.

If an appropriate delay is placed in front of the PTF in the adaptive
noise canceler, as shown in Figure 5, the resulting circuit is known as
an Adaptive Line Enhancer (ALE). This circuit is capable of putting a
passband around a narrow-band intended signal in the presence of a
strong wide-band random signal. As a result, it is capable of
separating these two types of signals. In the following section it will
be shown why an ALE works this way.




ANALYSIS OF AN ADAPTIVE LINE ENHANCER

Equation 8 will be used to analyze the adaptive line enhancer shown in
Figure 5. The analysis will show that the ALE can be used to separate
the following:

CASE 1 - A weak, random broad-band signal from a strong narrow-band
interferer.

CASE 2 - A weak, narrow-band signal from a strong random, broad-band
interferer.


For both cases:

S = a weak intended signal

N0 = a strong interferer or "noise", where N0 >> S

CASE  1 


S  =  weak,  random,  broad-band  signal 
N0  =  strong,  narrow-band  interferer 

Equation 10 for P, the cross-correlation vector, has as its components
the cross-correlations between the desired response ($d_k$) and the
adaptive filter tap outputs ($X_k, X_{k-1}, \ldots, X_{k-n}$). $d_k$ is
the input to the positive terminal of the second or output summer as
shown in Figure 5.


Assuming that the input power is evenly split between the primary and
reference (upper and lower) branches of the ALE circuit:

$$d_k = \frac{S_k + N_{0,k}}{\sqrt{2}} \qquad (21)$$

where $S_k$ and $N_{0,k}$ indicate that each of these signals is
sampled at the time corresponding to time index k.

The amplitude that will be the input to the delay element in the lower
branch of the ALE is also $(S_k + N_{0,k})/\sqrt{2}$. The delayed
output is denoted by $(DS_k + DN_{0,k})/\sqrt{2}$, where "D" indicates
that the signal has been delayed by delta ($\Delta$) units of time.
Thus $(DS_k + DN_{0,k})/\sqrt{2}$ is the input to the adaptive filter,
i.e.,

$$X_k = \frac{DS_k + DN_{0,k}}{\sqrt{2}} \qquad (22)$$

The signal on the first tap of the adaptive filter is

$$X_{k-1} = \frac{DS_{k-1} + DN_{0,k-1}}{\sqrt{2}} \qquad (23)$$

This means that the first tap introduces a time delay of one sample
period, i.e., $S_{k-1}$ and $N_{0,k-1}$ denote $S_k$ and $N_{0,k}$
delayed by one sample period. The signal amplitude on the ith tap is
$X_{k-i}$:

$$X_{k-i} = \frac{DS_{k-i} + DN_{0,k-i}}{\sqrt{2}} \qquad (24)$$

A typical component of the cross-correlation vector P, such as the ith
component, is $E[d_k X_{k-i}]$. Equations 21 and 24 for $d_k$ and
$X_{k-i}$ imply that:

$$E[d_k X_{k-i}] = E\left[\frac{S_k + N_{0,k}}{\sqrt{2}} \cdot \frac{DS_{k-i} + DN_{0,k-i}}{\sqrt{2}}\right] \qquad (25)$$

$$= \frac{1}{2} E[S_k\, DS_{k-i} + S_k\, DN_{0,k-i} + N_{0,k}\, DS_{k-i} + N_{0,k}\, DN_{0,k-i}]$$

$$E[d_k X_{k-i}] = \frac{1}{2}\left(E[S_k\, DS_{k-i}] + E[S_k\, DN_{0,k-i}] + E[N_{0,k}\, DS_{k-i}] + E[N_{0,k}\, DN_{0,k-i}]\right) \qquad (26)$$


The purpose of introducing a delay element into an adaptive noise
canceler to form an ALE is to decorrelate the wide-band component of
the input from itself. If the delay time delta ($\Delta$) is chosen
larger than the autocorrelation time of the wide-band signal, then the
correlation between the delayed and the original wide-band component
will be zero by definition of autocorrelation time. For Case 1, S is
the weak random broad-band intended signal. If the delay time $\Delta$
is larger than the autocorrelation time of S, then:

$$E[S_k\, DS_{k-i}] = 0 \qquad (27)$$

for  all  i  or  equivalently  for  all  taps  of  the  adaptive  filter.  The 
left  side  of  equation  27  is  just  the  first  term  of  equation  26. 
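The decorrelating effect of the delay can be seen in a small numerical experiment. The sketch below is an illustration under assumptions that are not in the report: the broad-band signal is white noise smoothed over 4 samples (so its autocorrelation time is under 4 samples), the narrow-band signal is a sinusoid with a 20-sample period, and the delay is 20 samples.

```python
import numpy as np


def lagged_correlation(x, lag):
    """Sample estimate of E[x_k * x_{k-lag}]."""
    return float(np.mean(x[lag:] * x[:len(x) - lag]))


rng = np.random.default_rng(2)
n = 200000
# Assumed broad-band signal: 4-sample moving sum of white noise, scaled
# to unit variance. Samples 4 or more apart share no noise terms.
wide = np.convolve(rng.standard_normal(n), np.ones(4) / 2.0, mode="valid")
# Assumed narrow-band signal: sinusoid at 0.05 cycles per sample.
narrow = np.sin(2 * np.pi * 0.05 * np.arange(len(wide)))

delay = 20   # longer than the broad-band autocorrelation time
wide_corr = lagged_correlation(wide, delay)
narrow_corr = lagged_correlation(narrow, delay)
print(wide_corr, narrow_corr)
```

The broad-band correlation at this delay is close to zero, while the sinusoid remains strongly correlated with its delayed copy; this difference is what the ALE exploits.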

The  analysis  now  becomes  very  similar  to  the  analysis  of  the 
cross-correlation  vector  of  an  adaptive  noise  canceler.  Since  it 
is  assumed  that  the  noise  or  narrow-band  interference  N0  is  much 
larger  than  the  signal  S  (N0  >>  S) ,  this  implies  that  in  most  cases 
the  last  term  of  equation  26  will  be  much  larger  than  either  of  the 
other  two  non-zero  terms,  i.e., 

$$E[N_{0,k}\, DN_{0,k-i}] > E[S_k\, DN_{0,k-i}] \qquad (28)$$

and

$$E[N_{0,k}\, DN_{0,k-i}] > E[N_{0,k}\, DS_{k-i}] \qquad (29)$$


It is possible, however, that $N_{0,k}$ and $DN_{0,k-i}$ may be 90
degrees out of phase. In this case,

$$E[N_{0,k}\, DN_{0,k-i}]$$

might not be larger than the other terms and inequalities 28 and 29
would not be valid. But then the sum of all three non-zero terms would
be of order $N_0 S$, which is much smaller than $N_0^2$. Therefore,
every component of the cross-correlation vector P is either dominated
by the narrow-band interference N0, via inequalities 28 and 29 and
equation 26, or is small compared to N0.

A typical element of the autocorrelation matrix for an ALE is

$$E[X_{k-i} X_{k-j}] = \frac{1}{2} E[(DS_{k-i} + DN_{0,k-i})(DS_{k-j} + DN_{0,k-j})] \qquad (30)$$

$$E[X_{k-i} X_{k-j}] = \frac{1}{2}\left(E[DS_{k-i}\, DS_{k-j}] + E[DS_{k-i}\, DN_{0,k-j}] + E[DN_{0,k-i}\, DS_{k-j}] + E[DN_{0,k-i}\, DN_{0,k-j}]\right) \qquad (31)$$

Equation 31 is very similar to equation 15 for a typical element of the
autocorrelation matrix of an adaptive noise canceler with a single
input. The only difference is the delay. The analysis of equation 31 is
exactly the same as that of equation 15. Since the interference N0 is
much larger than the intended signal, every element of the
autocorrelation matrix R for an ALE is either dominated by the
narrow-band interference or is small compared to it. This will also be
true for the inverse, $R^{-1}$. It was previously shown that this is
also true for the ALE cross-correlation vector P. Therefore equation 8
for the optimal weight vector, $W^* = R^{-1} P$, implies that the
weight vector that the adaptive filter "converges" to is primarily a
function of the interfering narrow-band signal, N0.

This is why the adaptive filter in an ALE puts a "bandpass" around the
interferer. For all practical purposes it never "sees" (via equations
8, 9, and 10 for W*, R, and P, respectively) the intended random
wide-band signal.

In other words, for Case 1 (a weak random broad-band signal and a
strong narrow-band interferer) it has been shown that R and P (given by
equations 9 and 10, respectively) are primarily functions of, or are
dominated by, N0. Equation 8 then implies that the optimum weight
vector W* is dominated by N0. Equation 47 (see Case 2 analysis) gives
the frequency response $H(\omega)$ of the PTF as:

$$H(\omega) = \sum_{i=1}^{n} W_i\, e^{-j\omega\Delta(i-1)} \qquad (47)$$

where:

$H(\omega)$ = frequency transfer function

$\omega$ = frequency

$\Delta$ = intertap delay

n = number of taps

$j = \sqrt{-1}$

The frequency response H(ω) is a function of the weights W_i. The
optimum weight vector W* is primarily a function of N0, the
narrow-band interferer. A consequence of the domination of W* by N0
is that when W* is substituted into equation 47, H(ω) develops a
peak or maximum around the frequency of the narrow-band signal. It
is in this sense that the PTF frequency response never "sees" the
intended weak random broad-band signal.

CASE 2


Let  S  =  weak  narrow-band  intended  signal 
N0  =  strong  wide-band  random  interferer 

Equation 8, W* = R^{-1} P, was again used to analyze the ALE.

The ith component of the cross-correlation vector P is still given
by equation 26. Now N0, the strong interferer, is a wide-band
random signal. It is again assumed that the delay time Δ is chosen
larger than the autocorrelation time of the wide-band random
signal. As a result, the correlation between the delayed and
original wide-band random signal will be zero, i.e.,

$$E[N0_k \cdot DN0_{k-i}] = 0 \qquad (32)$$

Thus, for Case 2, $E[N0_k \cdot DN0_{k-i}]$ is not the dominant term
in equation 26 that it was for Case 1; in fact, it makes no
contribution to equation 26.

Thus, by the introduction of an appropriate delay time Δ, the
influence that $E[N0_k \cdot N0_{k-i}]$ had in equation 20 for the
cross-correlation vector element for an adaptive noise canceler
with single input becomes nullified. Since the interference N0 is
assumed to be much larger than the intended signal S,

$$E[N0_k \cdot N0_{k-i}]$$

for the adaptive noise canceler with single input or

$$E[N0_k \cdot DN0_{k-i}]$$

for an ALE has the potential to be the dominant term in equation 20
or 26, respectively. The elimination of the left-hand side of
equation 32 is the major effect that the time delay in the ALE
produces.

The interferer can only contribute to the cross-correlation
element via the second and third terms of equation 26,

$$E[S_k \cdot DN0_{k-i}] \quad \text{and} \quad E[N0_k \cdot DS_{k-i}].$$

However, since N0 >> S, these terms will be many orders of
magnitude smaller than $E[N0_k \cdot N0_{k-i}]$ or
$E[N0_k \cdot DN0_{k-i}]$, where the intertap delay Δ is not chosen
long enough to decorrelate N0.

In the ideal case, if there is no correlation between the signal S
and the interference, then both $E[S_k \cdot DN0_{k-i}]$ and
$E[N0_k \cdot DS_{k-i}]$ will equal zero. Then in equation 26 only
the first term, $E[S_k \cdot DS_{k-i}]$, will be non-zero. This term
is a function of only the intended signal, not the interference. So
if a weak narrow-band intended signal and a strong wide-band random
interference are uncorrelated, the cross-correlation vector is only
a function of the intended signal, not the interference.
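
The effect of the delay on the two kinds of signal can be checked
numerically. The sketch below is illustrative only: the sample count,
the delay, the tone period, and the use of Gaussian noise as the
wide-band signal are assumptions, not values taken from this report.
It estimates the correlation between a signal and its delayed copy,
once for white noise and once for a single tone.

```python
import math
import random

def correlation(x, y):
    """Sample estimate of E[x_k * y_k] over the overlapping samples."""
    n = min(len(x), len(y))
    return sum(x[k] * y[k] for k in range(n)) / n

def delayed(x, delay):
    """Pair each sample x[k] with the sample `delay` steps later, x[k + delay]."""
    return x[:-delay], x[delay:]

random.seed(0)
num_samples = 20000
delay = 5  # assumed delay, in samples

# Wide-band interferer: white Gaussian noise with unit variance (assumption).
noise = [random.gauss(0.0, 1.0) for _ in range(num_samples)]
# Narrow-band signal: a single tone with a 50-sample period (assumption).
tone = [math.cos(2.0 * math.pi * k / 50.0) for k in range(num_samples)]

noise_corr = correlation(*delayed(noise, delay))  # close to zero
tone_corr = correlation(*delayed(tone, delay))    # stays large

print("noise vs delayed noise:", noise_corr)
print("tone  vs delayed tone :", tone_corr)
```

The noise term averages toward zero while the tone term does not,
which is the behavior equation 32 relies on.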

Thus, for an ALE an appropriate time delay will minimize the
effect of the wide-band random interference on the cross-correlation
vector P. Since W* = R^{-1} P, it is necessary to know how
interference and the time delay affect R, the autocorrelation
matrix, and its inverse R^{-1}. A typical element of the
autocorrelation matrix for an ALE is given by equations 30 and 31.
The last term in equation 31 is potentially the largest term since
N0 >> S. This term gives the major effect of the interfering signal
on the autocorrelation matrix.

The conditions under which the last term in equation 31,
$E[DN0_{k-i} \cdot DN0_{k-j}]$, is zero or relatively small will now
be investigated.

Assume that the random wide-band interference is white noise,
i.e., with a power spectral density that is constant (say C) for
all frequencies. It can be shown that the Fourier transform of
the power spectral density of this white noise, the autocorrelation
function R(τ), is the same constant times a delta function δ(τ),
i.e.,

$$R(\tau) = C \, \delta(\tau) \qquad (33)$$

Equation 33 implies that R(τ) is equal to zero except for τ = 0.
This means that for a white noise signal N(t), N(t) and N(t + τ)
are uncorrelated no matter how small a non-zero τ becomes.

The fourth term of equation 31, $E[DN0_{k-i} \cdot DN0_{k-j}]$, is
basically the autocorrelation of the delayed interference input
($DN0_k$) to the adaptive filter of the ALE. The correlation is performed
between  the  ith  and  jth  taps  of  the  filter.  It  correlates  the 
interference  output  that  appears  at  the  ith  and  jth  taps  using  a 
correlation  delay  that  is  equal  to  the  propagation  delay  between 
the  two  taps. 

If it is assumed that N0 is white noise, then equation 33
implies that

$$E[DN0_{k-i} \cdot DN0_{k-j}] = 0 \quad \text{when } i \neq j \qquad (34)$$

and that

$$E[DN0_{k-i} \cdot DN0_{k-j}] = C \quad \text{when } i = j \qquad (35)$$

If it is further assumed that the signal S and the interference N0
are uncorrelated, then the second and third terms of equation 31
are zero for all values of i and j.

Thus, for white noise interference, the autocorrelation
matrix given by equation 31 is as follows:

for the off-diagonal elements (i ≠ j), equation 34 implies that:

$$R_{ij} = \tfrac{1}{2} E[DS_{k-i} \cdot DS_{k-j}] \qquad (36)$$

for the diagonal elements (i = j):

$$R_{ii} = E[X_{k-i} \cdot X_{k-i}] = \tfrac{1}{2} \left( E[DS_{k-i} \cdot DS_{k-i}] + E[DN0_{k-i} \cdot DN0_{k-i}] \right) \qquad (37)$$

Substituting equation 35 into equation 37 implies:

$$R_{ii} = \tfrac{1}{2} \left( E[DS_{k-i} \cdot DS_{k-i}] + C \right) \qquad (38)$$

and since N0 >> S,

$$E[DN0_{k-i} \cdot DN0_{k-i}] \gg E[DS_{k-i} \cdot DS_{k-i}] \qquad (39)$$

Inequality 39, when substituted into either equation 37 or 38,
implies that

$$R_{ii} \approx C/2 \qquad (40)$$

It follows from inequality 39 and equation 40 that the
off-diagonal elements (given by equation 36) are small compared to
the diagonal elements. Expressed as an inequality:

$$R_{ii} \gg R_{ij} \qquad (41)$$

Inequality 41 and equation 40 imply that for white noise
interference, the autocorrelation matrix R can be approximated by
a matrix that is both diagonal and scalar (a scalar matrix is a
diagonal matrix whose diagonal elements are all equal):

$$R \approx \begin{bmatrix} C/2 & 0 & \cdots & 0 \\ 0 & C/2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C/2 \end{bmatrix} \qquad (42)$$

If a matrix is scalar, its inverse will also be scalar. So
R^{-1} can be expressed as follows:

$$R^{-1} = \begin{bmatrix} K & 0 & \cdots & 0 \\ 0 & K & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & K \end{bmatrix} \qquad (43)$$

where K is some function of C, the power spectral density of the
wide-band random interferer. K can be factored out of equation 43
to give:

$$R^{-1} = K \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \qquad (44)$$

The matrix in equation 44 is the identity matrix I. Equation 44
now becomes:

$$R^{-1} = K I \qquad (45)$$

where K is a scalar (a number), not a matrix.

Substituting equation 45 into equation 8, W* = R^{-1} P, for the
optimal weight vector of the adaptive filter gives

$$W^* = K I P = K P \qquad (46)$$

since IP = P.

Equation 46 can be used to investigate the frequency transfer
function of the adaptive filter. The adaptive filter is a tapped
delay line or transversal filter. The frequency response of a
tapped delay line can be shown (Reference 4) to be

$$H(\omega) = \sum_{i=1}^{n} W_i \, e^{-j\omega\Delta i} \qquad (47)$$

where:

H(ω) = frequency transfer function
ω = frequency
Δ = intertap delay
n = number of taps
j = √(-1)

Substituting equation 46 into equation 47 gives:

$$H(\omega) = \sum_{i=1}^{n} W_i \, e^{-j\omega\Delta i} = \sum_{i=1}^{n} K P_i \, e^{-j\omega\Delta i} \qquad (48)$$

$$H(\omega) = K \sum_{i=1}^{n} P_i \, e^{-j\omega\Delta i} \qquad (49)$$


Equation 49 indicates that K (and hence C) does not affect
the relative frequency response, i.e.,

$$\frac{H(\omega_1)}{H(\omega_2)} = \frac{K \sum_{i=1}^{n} P_i \, e^{-j\omega_1\Delta i}}{K \sum_{i=1}^{n} P_i \, e^{-j\omega_2\Delta i}} = \frac{\sum_{i=1}^{n} P_i \, e^{-j\omega_1\Delta i}}{\sum_{i=1}^{n} P_i \, e^{-j\omega_2\Delta i}} \qquad (50)$$

K is just a multiplicative or scale factor in equation 49. It
cannot affect the relative frequency response, H(ω1)/H(ω2),
because it cancels out in equation 50. Thus, for a white noise
interferer uncorrelated with the signal, the use of the optimal
weights W* for the adaptive filter in the ALE causes the relative
frequency response to be determined by the cross-correlation vector
P. However, it was previously shown that for an ALE, an appropriate
time delay will minimize or possibly eliminate the effect of
the wide-band random interference on the cross-correlation vector
P. P will be determined by S, the weak narrow-band signal (assuming
that the signal S and the interference N0 are uncorrelated).

The signal S will determine the relative frequency response (via
equation 50), i.e., S will determine the frequency response up to a
scale factor. The interferer N0 will determine the scale factor K
(K is a function of C, the power spectral density of N0).

Therefore, for white noise interference, it is the weak
narrow-band intended signal that determines what frequencies are
passed or rejected by the adaptive filter. This is why the adaptive
filter (for Case 2) can put a "bandpass" around the signal S
and later subtract it from S + N0 at the summer.
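
The cancellation of K in equation 50 can be illustrated with a short
numerical sketch. The cross-correlation vector P, the tap delay, the
two frequencies, and the two values of K below are hypothetical
numbers chosen only for illustration. The weights are formed as
W* = KP (equation 46) and the response is evaluated from equation 47.

```python
import cmath

def frequency_response(weights, omega, tap_delay):
    """Equation 47: H(w) = sum over i = 1..n of W_i * exp(-j * w * tap_delay * i)."""
    return sum(w * cmath.exp(-1j * omega * tap_delay * i)
               for i, w in enumerate(weights, start=1))

def relative_response(weights, omega_1, omega_2, tap_delay):
    """Ratio H(omega_1) / H(omega_2), as in equation 50."""
    return (frequency_response(weights, omega_1, tap_delay) /
            frequency_response(weights, omega_2, tap_delay))

# Hypothetical cross-correlation vector P (illustrative numbers only).
P = [0.9, 0.5, -0.1, -0.6]
tap_delay = 1.0
omega_1, omega_2 = 0.7, 2.1

# Equation 46: W* = K P. Two different noise levels give two different K.
ratios = []
for K in (0.02, 50.0):
    weights = [K * p for p in P]
    ratios.append(relative_response(weights, omega_1, omega_2, tap_delay))

# The two ratios agree to rounding error, although K differs by a factor of 2500.
print(ratios[0], ratios[1])
```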

The key assumption in the above analysis was that the wide-band
random interferer was white noise. White noise uncorrelated
with the signal implies that R (via equation 42) and R^{-1} (via
equations 43 and 44) are scalar matrices. The scalar matrix R^{-1}
implies equation 46: W* = KP. Equation 46 implies that the
relative frequency response is determined by P. But the
cross-correlation vector P is determined by the signal S. Thus it
was concluded that the relative frequency response is determined by
the intended narrow-band signal.

It will now be determined whether or not the conclusion, that
the relative frequency response of the adaptive filter is determined
by the intended narrow-band signal, is still valid if the
wide-band random interferer is not white noise. If R and R^{-1}
still remain scalar matrices, then the conclusion will remain
valid. A typical element of the autocorrelation matrix R for an
ALE is given by equation 31. If the noise and the signal are
uncorrelated, equation 31 becomes:

$$R_{ij} = \tfrac{1}{2} \left( E[DS_{k-i} \cdot DS_{k-j}] + E[DN0_{k-i} \cdot DN0_{k-j}] \right) \qquad (51)$$

If it is assumed that N0 >> S, then the diagonal terms of
equation 51 are given by

$$R_{ii} = \tfrac{1}{2} E[DN0_{k-i} \cdot DN0_{k-i}] \qquad (52)$$

R_ii is a measure of the energy at tap i of the adaptive filter.
Assuming that the same energy appears at each tap, then R_ii will
have the same value for all i, i.e.,

$$R_{11} = R_{22} = \cdots = R_{nn} \qquad (53)$$

The off-diagonal elements of R are still given by equation 51.
Since N0 >> S, the first term of equation 51 will be small
compared to the diagonal elements, i.e.,

$$E[DN0_{k-i} \cdot DN0_{k-i}] \gg E[DS_{k-i} \cdot DS_{k-j}] \qquad (54)$$

If the second term of equation 51 is also small compared
to the diagonal elements, i.e., if:

$$E[DN0_{k-i} \cdot DN0_{k-i}] \gg E[DN0_{k-i} \cdot DN0_{k-j}] \qquad (55)$$

then equations 51, 52, 53, 54, and 55 imply that the
autocorrelation matrix R can be approximated by a scalar matrix.

Therefore, the conclusion that the relative frequency response is
determined by the intended narrow-band signal remains valid.

The key assumption above was inequality 55. It shows that the
delayed random wide-band interference N0 must significantly
decorrelate between the ith and jth taps of the adaptive filter for
R to be approximated by a scalar matrix. Since i and j can take on
any values, except i = j, the delayed random wide-band interference
must significantly decorrelate over one intertap delay time in
order for R to look like a scalar matrix. This will ensure that
the relative frequency response is determined by the intended
signal and hence that the ALE will put a "bandpass" around the
intended signal.

It is important to note that for both Case 1 (S = weak random
wide-band intended signal, N0 = strong narrow-band interferer) and
Case 2 (S = weak narrow-band intended signal, N0 = strong random
wide-band interferer) it is the narrow-band signal that the
adaptive filter puts a "passband" around. Intuitively this makes
sense. A narrow-band deterministic signal can be subtracted from
the sum of the same narrow-band deterministic signal and a
wide-band random signal. The narrow-band signals can cancel out.
Subtracting a wide-band random signal from that same sum will not
cancel out the wide-band random signal. Randomness will prevent
cancellation.




MEAN  SQUARE  ERROR  AS  A  PERFORMANCE  MEASURE  FOR 
ADAPTIVE  ALGORITHMS 

Before  adaptive  algorithms  can  be  investigated,  a  performance 
measure  or  performance  function  for  the  adaptive  filter  must  be 
defined.  A  very  useful  and  well  understood  performance  function 
evaluated  in  this  paper  is  Mean  Square  Error. 

The generation of an adaptive filter error signal is illustrated
in Figure 6. The sampled output of the adaptive filter
is subtracted from a sampled desired signal response to generate
an error signal. The "desired" response will not usually be the
intended signal that the filter is trying to detect. If the
intended signal were known, there would be no need for an adaptive
filter to detect it. The "desired" response must be related to the
intended signal in some manner. For the case of an adaptive noise
canceler (illustrated in Figure 2) the "desired" response is the
primary input, i.e., the intended signal S plus the interference N0.

If the adaptive filter error function $\varepsilon_k$ is squared,
$\varepsilon_k^2$ will never be negative and will therefore possess
a minimum value.

The adaptive filter should be able to work with random input
signals and random "desired" responses as well as with
deterministic signals because communications signals are often
modeled as random signals. This suggests that an appropriate
performance function for an adaptive filter would be the average or
mean of the squared error (denoted by $E[\varepsilon_k^2]$). Mean
square error can also be interpreted as the average power of the
error signal in Figure 5.

Figure 6. Generation of an Adaptive Filter Error


The mean square error as a function of input signal, "desired"
response, and tap weights can be derived using the following
definitions. The error signal $\varepsilon_k$ at time index k is
defined as:

$$\varepsilon_k = d_k - y_k \qquad (56)$$

The output of the PTF is given by:

$$y_k = w_0 x_k + w_1 x_{k-1} + w_2 x_{k-2} + \cdots + w_n x_{k-n} \qquad (57)$$

If the column vectors W and X_k are defined by

$$W = \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_n \end{bmatrix}, \qquad X_k = \begin{bmatrix} x_k \\ x_{k-1} \\ \vdots \\ x_{k-n} \end{bmatrix}$$

then equation 57 can be expressed as the vector dot product of
W and X_k:

$$y_k = W^T \cdot X_k \qquad (58)$$

where W^T is the transpose of W, i.e., W^T is a row vector.
Equation 58 can also be expressed as:

$$y_k = X_k^T \cdot W \qquad (59)$$

Substituting equations 58 and 59 into equation 56 gives:

$$\varepsilon_k = d_k - X_k^T \cdot W = d_k - W^T \cdot X_k \qquad (60)$$

Now square equation 60 to get:

$$\varepsilon_k^2 = (d_k - X_k^T \cdot W)(d_k - W^T \cdot X_k) \qquad (61)$$

$$\varepsilon_k^2 = d_k^2 - d_k W^T \cdot X_k - d_k X_k^T \cdot W + (X_k^T \cdot W)(W^T \cdot X_k)$$

$$\varepsilon_k^2 = d_k^2 + (W^T \cdot X_k)(X_k^T \cdot W) - 2 d_k (X_k^T \cdot W) \qquad (62)$$

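The second term of equation 62 can be written as

$$(W^T \cdot X_k)(X_k^T \cdot W) = W^T [X_k X_k^T] W \qquad (63)$$

where $[X_k X_k^T]$ is a matrix given by:

$$[X_k X_k^T] = \begin{bmatrix} x_k^2 & x_k x_{k-1} & x_k x_{k-2} & \cdots & x_k x_{k-n} \\ x_{k-1} x_k & x_{k-1}^2 & x_{k-1} x_{k-2} & \cdots & x_{k-1} x_{k-n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{k-n} x_k & x_{k-n} x_{k-1} & x_{k-n} x_{k-2} & \cdots & x_{k-n}^2 \end{bmatrix} \qquad (64)$$

Substituting equation 63 into equation 62 gives:

$$\varepsilon_k^2 = d_k^2 + W^T [X_k X_k^T] W - 2 d_k (X_k^T \cdot W) \qquad (65)$$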

If it is assumed that $\varepsilon_k$, $d_k$ and $X_k$ are
statistically stationary (i.e., statistical characteristics are
independent of time) and W is held constant, then taking the
expected value of equation 62 over the time index k yields the
following expression for mean square error (MSE):

$$\mathrm{MSE} = E[\varepsilon_k^2] = E[d_k^2] + W^T E[X_k X_k^T] W - 2 E[d_k X_k^T] \cdot W \qquad (66)$$

where E denotes the expected or mean or average value of the
quantity in brackets. In equation 66, $E[X_k X_k^T]$ is just the
input autocorrelation matrix R (see equation 9) and $E[d_k X_k^T]$
is just the transpose of the cross-correlation vector P (see
equation 10). Equation 66 then becomes:

$$\mathrm{MSE} = E[\varepsilon_k^2] = E[d_k^2] + W^T R W - 2 P^T \cdot W \qquad (67)$$
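
As a numerical check of equation 67, the sketch below compares the
directly averaged squared error with the quadratic form built from
sample estimates of E[d^2], R, and P. The data model (white Gaussian
input, a "desired" response formed from two taps plus a small noise
term) and the weight vector are assumptions made only for this
illustration.

```python
import random

def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

random.seed(1)
num_taps = 3
num_samples = 500
weights = [0.4, -0.2, 0.1]  # arbitrary fixed weight vector W

# Synthetic stationary data (assumption): x is white noise, d is a noisy
# linear combination of the current and previous input samples.
x = [random.gauss(0.0, 1.0) for _ in range(num_samples + num_taps)]
tap_vectors = [[x[k - i] for i in range(num_taps)]
               for k in range(num_taps, num_samples + num_taps)]
desired = [0.8 * X[0] - 0.3 * X[1] + random.gauss(0.0, 0.1)
           for X in tap_vectors]

N = len(tap_vectors)
# Sample estimates of E[d^2], R = E[X X^T] and P = E[d X].
mean_d2 = sum(d * d for d in desired) / N
R = [[sum(X[i] * X[j] for X in tap_vectors) / N for j in range(num_taps)]
     for i in range(num_taps)]
P = [sum(d * X[i] for d, X in zip(desired, tap_vectors)) / N
     for i in range(num_taps)]

# Left side: squared error of equation 60, averaged over all samples.
mse_direct = sum((d - dot(weights, X)) ** 2
                 for d, X in zip(desired, tap_vectors)) / N
# Right side: the quadratic form of equation 67.
RW = [dot(row, weights) for row in R]
mse_quadratic = mean_d2 + dot(weights, RW) - 2.0 * dot(P, weights)

print(mse_direct, mse_quadratic)
```

The two values agree to rounding error because equation 67 is an
algebraic identity once the same averages are used on both sides.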

It is obvious from equation 67 or 66 that MSE is a quadratic
function of the components of the weight vector W, i.e., the
components of W appear in equation 67 or 66 raised either to the
first or second power. This implies that when MSE is plotted against
all the tap weights the result is a hyperparaboloid. If there are n
taps in the PTF then a plot of MSE versus tap weights yields an
(n + 1) dimensional "parabola." This plot is known as a
performance surface.

An  n  +  1  dimensional  parabola  can  be  thought  of  as  an  (n  +  1) 
dimensional  "bowl".  This  "bowl"  must  be  concave  upward;  otherwise 
there  would  be  weight  settings  that  would  result  in  a  negative  MSE 
(i.e.,  negative  average  error  signal  power).  This  is  impossible 
with  real  physical  signals.  Since  the  MSE  is  a  quadratic  function, 
this  implies  that  there  is  a  single  point  at  the  bottom  of  the  MSE 
performance  surface  "bowl."  This  point  is  the  minimum  MSE.  The 
objective  of  all  adaptive  algorithms  is  to  drive  the  weights  and 
the  resulting  MSE  toward  this  point. 
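
The location of the bottom of the "bowl" follows from equation 67 by
setting the gradient with respect to W to zero. The short derivation
below assumes that R is symmetric and invertible, which the text
uses implicitly. It recovers the optimal weight vector of equation 8
and gives the corresponding minimum MSE.

```latex
\begin{aligned}
\mathrm{MSE} &= E[d_k^2] + W^T R W - 2 P^T W \\
\nabla_W \,\mathrm{MSE} &= 2 R W - 2 P \\
\nabla_W \,\mathrm{MSE} = 0 \;&\Rightarrow\; R W^* = P \;\Rightarrow\; W^* = R^{-1} P \\
\mathrm{MSE}_{\min} &= E[d_k^2] + W^{*T} R W^* - 2 P^T W^* = E[d_k^2] - P^T W^*
\end{aligned}
```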

Equation 8 for the optimal weight vector W* provides a direct
method of locating the bottom of the MSE performance surface bowl.
When the weight vector is set to W = W*, the mean square error is
at its minimum. This is known as the direct or matrix inversion
algorithm. This algorithm has several severe drawbacks associated
with it:

1. If the PTF has n taps, then (n+1)(n+4)/2 autocorrelation
and cross-correlation measurements must be made in order to
determine R and P. Such measurements must be repeated whenever the
input signal statistics change with time.

2.  The  autocorrelation  matrix  must  then  be  inverted. 

3. "Implementing a direct solution requires setting weight values
with a high degree of accuracy in open loop fashion, whereas a
feedback approach provides self correction of inaccurate settings
thereby giving tolerance to hardware error." (Reference 5) In other
words, because equation 8 has no feedback from the error output,
highly accurate weight values are required.

When  the  number  of  weights  is  large  or  the  input  data  rate 
(or  hopping  rate  for  frequency  hopping  radios)  is  high,  then  1  and 
2  above  imply  severe  computational  and  time  requirements  on  any 
direct  solution.  The  processor  implementing  a  matrix  inversion 
algorithm  might  not  be  able  to  implement  it  fast  enough  for  the 
algorithm  to  be  of  any  use.  Because  of  these  problems,  no  adaptive 
algorithms  that  require  the  measurement  of  an  autocorrelation 
matrix  or  the  computation  of  its  inverse  were  investigated. 
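
For reference, the direct solution that the text sets aside can be
sketched in a few lines. The matrix R and vector P below are
hypothetical values, and the elimination routine is an ordinary
textbook method, not a procedure taken from this report. The helper
`correlation_measurements` simply evaluates the count quoted in
item 1, read here as applying to weights w_0 through w_n.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * w = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are not modified.
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    # Back substitution.
    w = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][c] * w[c] for c in range(row + 1, n))
        w[row] = (aug[row][n] - known) / aug[row][row]
    return w

def correlation_measurements(n):
    """Measurement count quoted in the text: (n+1)(n+4)/2."""
    return (n + 1) * (n + 4) // 2

# Hypothetical 3x3 autocorrelation matrix R and cross-correlation vector P.
R = [[2.0, 0.5, 0.1],
     [0.5, 2.0, 0.5],
     [0.1, 0.5, 2.0]]
P = [1.0, 0.3, -0.2]

w_star = solve_linear_system(R, P)  # W* = R^{-1} P (equation 8)
residual = [sum(R[i][j] * w_star[j] for j in range(3)) - P[i]
            for i in range(3)]

print(w_star)
print(correlation_measurements(2))
```

Every change in the input statistics would require R and P to be
measured again and the system to be solved again, which is the cost
the text describes.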

Two types of adaptive algorithms that do not require any
knowledge of the autocorrelation matrix are the methods of Steepest
Descent and Random Search.

METHOD OF STEEPEST DESCENT

Before  introducing  the  method  of  steepest  descent  for  an 
arbitrary  number  of  tap  weights  (or  equivalently  an  arbitrary 
number  of  dimensions  in  the  mean  square  error  performance  surface) 
it  is  helpful  to  consider  the  method  of  steepest  descent  for  the 
simplest  case:  just  one  weight. 

The  one  weight  (univariable)  performance  surface,  which  is  a 
parabola,  is  shown  in  Figure  7. 

The method of steepest descent does not require knowledge of
the autocorrelation matrix R or the cross-correlation vector P.
Since  R  and  P  are  unknown,  equation  67  cannot  be  used  to  define 
the  MSE  performance  surface.  But  since  mean  square  error  can  also 
be  interpreted  as  the  average  power  of  the  error  signal,  MSE  can  be 
measured. 

In order to find W*, the weight that causes the MSE to be
minimized, an arbitrary weight value W0 is initially assumed. The
average power of the error signal is then measured in order to
determine the MSE at W0, i.e., one point on the MSE performance
"surface" shown in Figure 7 has been located. The ability to
locate points on the MSE performance "surface" allows measurement
of the slope of the parabola at W0 (the method by which the slope
is measured depends on the type of steepest descent algorithm
used).

A new weight value is then chosen equal to the initial
value W0 plus an increment proportional to the negative of the
slope at W0:

$$W_1 = W_0 + \mu \, (-\text{slope}) \qquad (68)$$

The point on the performance surface corresponding to W1 is
lower down on the parabola than the point corresponding to W0
(provided the constant μ is not too large). It is closer to the
minimum than the first point. Another new value, W2, is then
derived in the same way by measuring the slope of the parabola
at W1, i.e.,

$$W_2 = W_1 + \mu \, (-\text{slope}) \qquad (69)$$

This procedure is repeated until the slope of the parabola at
the iterated point is zero. It is obvious from Figure 7 that when
the slope of the parabola is zero, then W*, the weight that causes
the MSE to be minimized, has been identified. To summarize, for a
one weight filter with a parabolic error surface, the negative of
the slope of the parabola is used to "slide" down to the bottom of
the "bowl."
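
The one-weight iteration of equations 68 and 69 can be sketched as
follows. The parabola used as a stand-in for the measured MSE, the
step-size constant, the starting weight, and the way the slope is
obtained are all assumptions chosen for illustration.

```python
def measured_mse(w):
    """Stand-in for an MSE measurement: a parabola with its minimum at w = 2."""
    return 0.5 + 3.0 * (w - 2.0) ** 2

def slope(w, h=1e-4):
    """Slope of the parabola at w, estimated from two nearby MSE 'measurements'."""
    return (measured_mse(w + h) - measured_mse(w - h)) / (2.0 * h)

mu = 0.1   # step-size constant (assumed value)
w = -1.0   # arbitrary initial weight W0
history = [w]
for _ in range(50):
    w = w + mu * (-slope(w))  # equations 68 and 69
    history.append(w)

print(history[:4], history[-1])
```

With these numbers each step moves the weight 60 percent of the way
toward the minimum at w = 2. A much larger step-size constant would
overshoot the minimum instead of sliding toward it.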

For a filter with n taps and an (n + 1) dimensional
hyperparaboloidal mean square error surface, the objective is still
to "slide" down the error surface to the bottom of the "bowl."

In  order  to  identify  (at  any  given  point  on  the  MSE  surface) 
the  direction  in  which  to  slide,  the  negative  gradient  vector  of 
the  MSE  surface  is  used.  The  gradient  of  the  MSE  surface  at  a 
given  point  on  the  surface  gives  the  direction  in  which  the  MSE  is 
increasing  fastest  at  that  point.  The  negative  of  the  gradient  is 
the  direction  in  which  the  MSE  is  decreasing  fastest.  It  points 
the  way  to  the  steepest  (and  "fastest")  descent  down  the  MSE 
"bowl."  Hence  the  name  "Method  of  Steepest  Descent." 


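The gradient ∇ of the MSE surface is defined as the vector

$$\nabla = \left[ \frac{\partial(\mathrm{MSE})}{\partial W_0}, \; \frac{\partial(\mathrm{MSE})}{\partial W_1}, \; \ldots, \; \frac{\partial(\mathrm{MSE})}{\partial W_n} \right]^T \qquad (70)$$

i.e., each component of ∇ is a partial derivative of the MSE with
respect to a given weight.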

The method of steepest descent can be expressed by the
following algorithm:

$$W_{k+1} = W_k + \mu \, (-\nabla_k) \qquad (71)$$

where

W_k = the weight vector at the kth iteration, i.e., the
set of tap weights used on the kth iteration.

W_{k+1} = the weight vector at the (k+1)th iteration

∇_k = the gradient at the kth iteration point on the MSE
performance "surface"

μ = a constant that regulates the step or increment size of
the weight vector change. It determines how far to
"slide" down the performance surface before another
iteration is performed.

Equation 71 is a direct generalization of the one dimensional case
(equations 68 and 69). For any given set of tap weights, W_k, a new
set W_{k+1} can be computed (via equation 71) that yields a
smaller mean square error. In order to use equation 71 it must be
possible to compute the gradient ∇_k at the kth iteration point.

The manner in which the gradient is computed depends on the
specific steepest descent algorithm that is used. All steepest
descent algorithms, however, use the fact that mean square error can
be interpreted as the average power of the error signal to locate
points on the MSE performance surface and to ultimately use these
points to compute the gradient. To summarize, equation 71, the
defining equation for the method of steepest descent, allows an
iterative approach to the optimal weight vector W* without any
knowledge of the autocorrelation matrix R or the cross-correlation
vector P. The only prerequisite for using equation 71 is the
ability to measure average error signal power.

GRADIENT ESTIMATION

The two most widely used methods for estimating the gradient
at a given point on the mean square error surface are the
Differential Steepest Descent (DSD) algorithm and Widrow's Least
Mean Square (LMS) algorithm.

DIFFERENTIAL STEEPEST DESCENT ALGORITHM


In the DSD algorithm, each of the partial derivatives in
equation 70 is estimated by the method of symmetric differences
illustrated in Figure 8. To calculate ∂(MSE)/∂W_i at a given value
of W_i = W_Given, all the weights except W_i are held constant. As
per Figure 8, the mean square error is "measured" at
W_i = W_Given + δ and at W_i = W_Given - δ. The slope of the line
between the two points is then calculated via equation 72:

$$\text{slope} = \frac{\mathrm{MSE}(W_{Given} + \delta) - \mathrm{MSE}(W_{Given} - \delta)}{2\delta} \qquad (72)$$

This slope is an approximation of ∂(MSE)/∂W_i at W_i = W_Given.

The MSE terms in equation 72 above are just estimates of the
true MSE based on measurement of the average error signal power.
There will be an error associated with each MSE measurement. This
means that ∂(MSE)/∂W_i given by the slope in equation 72 will have
an error associated with it. Since δ is small, MSE(W_Given + δ)
and MSE(W_Given - δ) will be very close to each other. When the
two MSE values are subtracted, as in equation 72, the resulting
error (on a percentage basis) becomes greatly magnified. The only
way to reduce this subtraction or slope error is to reduce the MSE
error. This is done by repeated MSE measurement at both W_Given + δ
and at W_Given - δ. In other words, the error signal average power
must be measured M times at both W_Given + δ and at W_Given - δ; M
will be determined by the accuracy requirements of the particular
application. Therefore, the DSD algorithm requires 2M error signal
average power measurements per tap per iteration.

In the DSD algorithm, once the gradient has been approximated
(via equation 70) by the method of symmetric differences it is
substituted into the defining equation (equation 71) for the method
of steepest descent and a new set of tap weights is calculated.
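
A minimal sketch of the DSD procedure is given below. The quadratic
"bowl", the size of the measurement error, and the values of μ, δ,
and M are hypothetical. The function `noisy_mse` stands in for a
measurement of average error signal power and is not part of the
report.

```python
import random

TARGET = (1.0, -0.5)  # location of the bottom of the assumed "bowl"

def noisy_mse(weights, rng, noise_std=0.01):
    """Stand-in for one MSE measurement: a quadratic bowl plus measurement error."""
    true_mse = 1.0 + sum((w - t) ** 2 for w, t in zip(weights, TARGET))
    return true_mse + rng.gauss(0.0, noise_std)

def averaged_mse(weights, num_measurements, rng):
    """Average M repeated MSE measurements at the same weight setting."""
    total = sum(noisy_mse(weights, rng) for _ in range(num_measurements))
    return total / num_measurements

def dsd_gradient(weights, delta, num_measurements, rng):
    """Estimate each partial derivative by symmetric differences (equation 72)."""
    gradient = []
    for i in range(len(weights)):
        plus = list(weights)
        minus = list(weights)
        plus[i] += delta   # all other weights are held constant
        minus[i] -= delta
        mse_plus = averaged_mse(plus, num_measurements, rng)
        mse_minus = averaged_mse(minus, num_measurements, rng)
        gradient.append((mse_plus - mse_minus) / (2.0 * delta))
    return gradient

rng = random.Random(0)
weights = [0.0, 0.0]
mu, delta, M = 0.1, 0.05, 20
for _ in range(100):
    grad = dsd_gradient(weights, delta, M, rng)
    weights = [w - mu * g for w, g in zip(weights, grad)]  # equation 71

print(weights)
```

Each iteration uses 2M measurements per weight, taken one weight at
a time, which is the cost noted in the text.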

LEAST MEAN SQUARE (LMS) ALGORITHM


In the LMS or Widrow's algorithm it is assumed that the adaptive
filter is an adaptive linear combiner (see Figure 9). If
data are acquired and input in parallel to an adaptive linear
combiner, the structure in Figure 9a is used. For serial data
input the structure in Figure 9b is used. Note that Figure 9b is
just a tapped delay line or transversal filter. It is further
assumed that a "desired" response signal is available. These two
assumptions were not made for the DSD algorithm. So DSD is more
general than LMS, i.e., it is not tied to a single filter
structure. LMS is only applicable to the adaptive linear combiner.

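In the LMS algorithm, each of the partial derivatives in
equation 70 can be estimated by assuming that the mean square
error (MSE) can be estimated by a single measurement of the squared
error, i.e.,

$$\mathrm{MSE} \approx \varepsilon_k^2 \qquad (73)$$

where $\varepsilon_k$ = single measurement of the error at the kth
iteration. Equation 73 is the key assumption in the LMS algorithm.
Substituting equation 73 into equation 70 results in:

$$\nabla_k \approx \left[ \frac{\partial(\varepsilon_k^2)}{\partial W_0}, \; \frac{\partial(\varepsilon_k^2)}{\partial W_1}, \; \ldots, \; \frac{\partial(\varepsilon_k^2)}{\partial W_n} \right]^T \qquad (74)$$

where ∇_k is the gradient of the MSE performance surface at the kth
iteration point.

$$\frac{\partial \varepsilon_k^2}{\partial W_i} = \frac{d \varepsilon_k^2}{d \varepsilon_k} \cdot \frac{\partial \varepsilon_k}{\partial W_i} = 2 \varepsilon_k \frac{\partial \varepsilon_k}{\partial W_i} \qquad (75)$$

Figure 9. Adaptive Linear Combiner: (a) In General Form;
(b) As a Transversal Filter (From Ref. 1, Page 101)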


Since an adaptive linear combiner filter structure was assumed,
this implies that:

$$\varepsilon_k = d_k - \sum_{i=0}^{n} X_{ki} W_{ki} \qquad (76)$$

where $X_{ki}$ = signal at tap i during the kth iteration

$W_{ki}$ = tap weight at tap i during the kth iteration.

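Taking the derivative of equation 76 with respect to $W_{ki}$ gives

$$\frac{\partial \varepsilon_k}{\partial W_{ki}} = -X_{ki} \qquad (77)$$

In equation 77, in order to be consistent with equation 75, we will
change $X_{ki}$ to $X_i$ and $W_{ki}$ to $W_i$. Equation 77 then
becomes:

$$\frac{\partial \varepsilon_k}{\partial W_i} = -X_i \qquad (78)$$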

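Substituting equation 78 into equation 75 gives:

$$\frac{\partial \varepsilon_k^2}{\partial W_i} = -2\varepsilon_k X_i \qquad (79)$$

Substituting equation 79 into equation 74 gives:

$$\nabla_k = [-2\varepsilon_k X_0,\ -2\varepsilon_k X_1,\ \ldots,\ -2\varepsilon_k X_n]^T = -2\varepsilon_k X_k \qquad (80)$$

where $X_k = [X_0, X_1, \ldots, X_n]^T$, i.e., $X_k$ is a vector
representing the tap values at the kth iteration.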
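
The gradient expression in equation 80 can be checked numerically.
The following is a minimal sketch, not part of the original report:
it averages the per-sample quantity of equation 80 over a set of
simulated measurements and compares the result with central
differences of the averaged squared error. The data, the weight
values, and all names (taps, d, w, sample_mse) are illustrative
assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical measurements: each row of `taps` is one tap vector X_k,
# and d holds the matching desired-response samples d_k.
taps = rng.standard_normal((2000, 4))
d = taps @ np.array([0.5, -0.3, 0.2, 0.1]) + 0.1 * rng.standard_normal(2000)
w = np.array([0.1, 0.0, -0.2, 0.3])   # arbitrary fixed weight vector

def sample_mse(weights):
    """Average of eps_k**2 over all rows, with eps_k from equation 76."""
    return np.mean((d - taps @ weights) ** 2)

# Equation 80 evaluated for every sample, then averaged.
eps = d - taps @ w
avg_lms_gradient = np.mean(-2.0 * eps[:, None] * taps, axis=0)

# Central differences of the averaged squared error for each weight.
step = 1e-6
fd_gradient = np.array([
    (sample_mse(w + step * unit) - sample_mse(w - step * unit)) / (2.0 * step)
    for unit in np.eye(len(w))
])

print(avg_lms_gradient)
print(fd_gradient)
```

The two printed vectors should agree closely, because the averaged
squared error is quadratic in the weights. A single term of the
average is the noisy one-sample estimate that LMS actually uses.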

The method of steepest descent is defined by equation 71:

$$W_{k+1} = W_k + \mu(-\nabla_k) \qquad (71)$$

Substituting equation 80 into equation 71 gives:

$$W_{k+1} = W_k + 2\mu\varepsilon_k X_k \qquad (81)$$

Equation  81  is  the  LMS  algorithm. 
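
As an illustration, the update of equation 81 can be sketched in a
few lines of Python. This is a minimal sketch under stated
assumptions, not the implementation used in this report: the
function name lms, the step size, the white-noise input, and the
four-weight "unknown" filter h that generates the desired response
are all hypothetical.

```python
import numpy as np

def lms(x, d, num_taps, mu):
    """Apply the LMS update (equation 81) to serial input x with desired response d."""
    w = np.zeros(num_taps)        # tap weights W_k, started at zero
    err = np.zeros(len(x))        # error eps_k at each iteration
    for k in range(num_taps - 1, len(x)):
        # Tap signals X_k of the transversal filter: x[k], x[k-1], ...
        x_k = x[k - num_taps + 1:k + 1][::-1]
        err[k] = d[k] - x_k @ w               # equation 76
        w = w + 2.0 * mu * err[k] * x_k       # equation 81
    return w, err

rng = np.random.default_rng(0)
h = np.array([0.5, -0.3, 0.2, 0.1])   # hypothetical weights to be identified
x = rng.standard_normal(5000)         # serial input samples
d = np.convolve(x, h)[:len(x)]        # desired response signal
w, err = lms(x, d, num_taps=4, mu=0.01)
print(w)
```

In this noise-free example the weights settle on h. Each iteration
needs only one error sample and the current tap signals, which is
the property that makes the algorithm easy to compute.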

The LMS algorithm is very easy to compute and, given the right
hardware, it can be done very quickly. It does not require off-line
gradient estimation or repetitive error measurements as in the DSD
algorithm. In addition, for a given iteration, all of the signal
values ($X_0, X_1, \ldots, X_n$) at the individual taps can in
theory be measured in parallel at the same time. This allows a
parallel measurement of the gradient (via equation 80). This is in
contrast to the DSD algorithm, where each partial derivative
($\partial(\mathrm{MSE})/\partial W_i$) must be measured
sequentially in order to compute the gradient via equation 70. Thus
the LMS algorithm is potentially much faster than the DSD algorithm.
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RANDOM  SEARCH  ALGORITHM 


So far, two adaptive algorithms have been considered: Least Mean
Square (LMS) and Differential Steepest Descent (DSD). LMS adapts
faster than DSD. LMS does, however, require knowledge of the signal
value at each tap of the programmable transversal filter (PTF). This
requirement adds complexity to the adaptive filter: an auxiliary PTF
has to be added, and the tap signal values are measured on the
auxiliary PTF so as not to interfere with the operation of the
"main" PTF. DSD is more general than LMS, but it requires that all
the partial derivatives of mean square error with respect to the
weights ($\partial(\mathrm{MSE})/\partial W_i$) be measured
sequentially. In addition, the MSE must be measured a number of
times to ensure accuracy. Random search algorithms do not require
knowledge of the signal at each tap of the PTF, as LMS does, nor do
they require measurement of $\partial(\mathrm{MSE})/\partial W_i$,
as DSD does. Random search algorithms tend to be slower than LMS,
but faster than DSD. DSD, however, will outperform random search
algorithms in terms of certain performance measures that are beyond
the scope of this report. Random search algorithms are useful when
LMS cannot be applied, i.e., when the adaptive filter is not an
adaptive linear combiner or PTF, or when its complexity is not
"affordable".

One of the most efficient random search algorithms is the Linear
Random Search (LRS) algorithm. In LRS, "a small random change is
tentatively added to the weight vector at the beginning of each
iteration. The corresponding change in mean square error performance
is observed. A permanent weight vector change, proportional to the
product of the change in performance and the initial tentative
change, is then made." (Ref. 7)

The new weight vector generated by the LRS algorithm is given by

$$W_{k+1} = W_k + \mu \left[ \hat{\xi}(W_k) - \hat{\xi}(W_k + U_k) \right] U_k \qquad (82)$$

where:

$U_k$ is a random vector.

$\hat{\xi}(W_k)$ is an estimate of mean square error at $W = W_k$
based on N samples.

$\hat{\xi}(W_k + U_k)$ is an estimate of mean square error at
$W = W_k + U_k$ based on N samples.

$\mu$ is a design constant affecting stability and rate of
adaptation.
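
The LRS update of equation 82 can be sketched as follows. This is a
minimal illustration under stated assumptions, not a definitive
implementation: the random vector is drawn from a zero-mean Gaussian
with standard deviation sigma, both MSE estimates in one iteration
are taken over the same block of N = 64 simulated samples, and the
names measure_block, mse_estimate, lrs_step, and the filter h are
hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def measure_block(rng, h, n_samples):
    """Simulate one block of tap signals and matching desired-response samples."""
    x = rng.standard_normal(n_samples + len(h) - 1)
    taps = sliding_window_view(x, len(h))[:, ::-1]   # one row per tap vector
    d = taps @ h                                     # desired response
    return taps, d

def mse_estimate(w, taps, d):
    """Estimate of mean square error at weight vector w from one block."""
    return float(np.mean((d - taps @ w) ** 2))

def lrs_step(w, taps, d, mu, sigma, rng):
    """One LRS iteration (equation 82)."""
    u = sigma * rng.standard_normal(w.shape)   # small tentative random change
    xi_w = mse_estimate(w, taps, d)            # estimate at W_k
    xi_wu = mse_estimate(w + u, taps, d)       # estimate at W_k plus the change
    return w + mu * (xi_w - xi_wu) * u         # permanent weight change

rng = np.random.default_rng(0)
h = np.array([0.5, -0.3, 0.2, 0.1])   # hypothetical weights to be identified
w = np.zeros(4)
mu, sigma, n_samples = 200.0, 0.01, 64

taps, d = measure_block(rng, h, n_samples)
initial_mse = mse_estimate(w, taps, d)
for _ in range(3000):
    taps, d = measure_block(rng, h, n_samples)
    w = lrs_step(w, taps, d, mu, sigma, rng)
final_mse = mse_estimate(w, taps, d)
print(w, initial_mse, final_mse)
```

Only the MSE estimates enter the update. Neither the individual tap
signals nor the partial derivatives of the MSE are used, which is
the property that distinguishes random search from LMS and DSD.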
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PTF  HARDWARE  IMPLEMENTATION 


Although the primary purpose of this report is to describe the
theoretical principles of adaptive noise canceling, this section
will be devoted to a description of a SAW device implementation of
a PTF.

"Several programmable SAW filters have been reported in the
literature (Refs. 10-13). Most are used for match filter operation.
A SAW/FET approach demonstrated 50 MHz of bandwidth centered at 150
MHz. However, tap control range was limited to 16 dB and single tap
insertion loss was 80 dB (Ref. 14). A monolithic GaAs approach in
which the SAW and the FETs are implemented on the same substrate
has demonstrated 58 dB dynamic range at 500 MHz over a 50 MHz
bandwidth."

A promising approach, suitable for use in an adaptive noise
canceler, is a hybrid programmable transversal filter (HPTF)
(Refs. 16, 17). All other programmable transversal filter designs
reported to date are severely limited by poor tap weight control
range (which limits filter sidelobe performance) and poor dynamic
range (which limits sensitivity). The HPTF solves both of these
problems by combining a LiNbO3 SAW device for high dynamic range
with GaAs dual-gate FETs for high tap weight control range. Measured
tap weight control range (70 dB) and dynamic range (85 dB over a
100 MHz bandwidth) are high enough to meet many system requirements.

"The HPTF consists of a tapped SAW delay line whose output
electrodes are connected to an array of tap weight control dual-gate
FETs (Figure 10). The signal is applied to an input transducer,
which generates a surface acoustic wave that propagates down the
substrate. An array of output transducers transforms this acoustic
wave back into electrical signals that are delayed copies of the
original input. Each output transducer is connected to the input
(gate-1) of a dual-gate FET (DGFET) tap weight control amplifier.
The tap weight is controlled by gate-2 voltage. The DGFET outputs
(drains) are connected to a common current summing bus.

The transversal filter can now be identified by the process of
shift, multiply and sum. Negative tap weights are generated with a
second DGFET array whose output is inverted by an external
differential amplifier. This alleviates the need for an invertor at
each tap." (Refs. 16, 17)
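
The shift, multiply and sum process, together with the two-array
scheme for negative tap weights, can be sketched numerically. This
is a sampled-data illustration only, assuming ideal unit delays
between taps and non-negative gains in each array. The function
names tapped_delay_sum and hptf_output and the example weights are
hypothetical and do not model the SAW or DGFET hardware.

```python
import numpy as np

def tapped_delay_sum(x, gains):
    """Shift, multiply and sum: tap i sees the input delayed by i samples."""
    y = np.zeros(len(x))
    for i, g in enumerate(gains):
        y[i:] += g * x[:len(x) - i]
    return y

def hptf_output(x, weights):
    """Signed tap weights realized with two arrays of non-negative gains."""
    w_pos = np.clip(weights, 0.0, None)     # first array: positive weights
    w_neg = np.clip(-weights, 0.0, None)    # second array: magnitudes of negative weights
    bus_pos = tapped_delay_sum(x, w_pos)    # first summing bus
    bus_neg = tapped_delay_sum(x, w_neg)    # second summing bus
    return bus_pos - bus_neg                # differential amplifier output

rng = np.random.default_rng(2)
x = rng.standard_normal(200)
weights = np.array([0.4, -0.7, 0.0, 0.25, -0.1])
y = hptf_output(x, weights)
reference = np.convolve(x, weights)[:len(x)]   # ordinary transversal (FIR) filter
print(np.max(np.abs(y - reference)))
```

The printed difference should be at rounding-error level, showing
that a single subtraction after the two summing buses reproduces a
filter with signed weights.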

The maximum power handling capability of an HPTF is limited by the
power that can be safely applied to the SAW input transducer (about
+20 dBm).

Typically,  when  used  either  as  a  bandpass  or  notch  filter,  an 
HPTF  can  reduce  interfering  signals  by  40-50  dB.  A  single  tap 
weight  on  the  HPTF  can  be  changed  in  approximately  1  microsecond. 

To change an entire set of tap weights to a second set will usually
take much longer. A 16-tap HPTF has 16 weights to be changed. If
this is done serially, then the single tap switching time of 1
microsecond must be multiplied by 16. In reality, a 128-tap filter
will be needed, so the 1 microsecond switching time per tap must be
multiplied by 128. In addition, a controller must address and
transfer the tap weights to the HPTF. The transfer time per tap
could be much larger than the single tap switching time. If the HPTF
is included in an adaptive noise canceler, then a number of tap
weight sets will have to be transferred from the controller to the
HPTF. The output power of the HPTF will have to be measured and
transferred to the controller.

Figure 10. Hybrid Programmable Transversal Filter (HPTF)
Concept (From Ref. 16 and 17)

If  Widrow's  algorithm  is  used,  the  signals  on  each  tap  have  to 
be  measured  and  transferred  to  the  controller.  For  each  tap,  the 
controller  will  then  have  to  calculate  a  new  weight.  The  speed  of 
the  calculation  will  depend  on  the  speed  of  the  controller.  All 
this  overhead  implies  a  much  longer  time  to  achieve  adaptive  con¬ 
vergence  (in  an  adaptive  noise  canceler)  than  to  simply  switch  a 
single  tap  weight. 
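
As a rough worked example of the serial update times quoted in this
section, take the 1 microsecond single tap switching time at face
value. The symbols $t_{sw}$, $t_{xfer}$, and $t_{set}$ are
introduced here for illustration only, and the per-tap transfer time
$t_{xfer}$ is not quantified in this report, so the figures below
are lower bounds on the time to load one complete set of N weights:

$$t_{set} = N\,(t_{sw} + t_{xfer}) \ge N\,t_{sw}$$

$$N = 16:\quad t_{set} \ge 16 \times 1\ \mu\mathrm{s} = 16\ \mu\mathrm{s}$$

$$N = 128:\quad t_{set} \ge 128 \times 1\ \mu\mathrm{s} = 128\ \mu\mathrm{s}$$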

It is expected that a 128-tap HPTF type filter will be able to
achieve 30 dB of filtering (in an adaptive noise canceler
configuration) in approximately 1 millisecond. A 128-tap HPTF type
filter is currently being developed for ETDL by Texas Instruments
under Contract No. DAAL01-88-C-0831.
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CONCLUSIONS 


The  theoretical  principles  developed  within  this  report  (i.e., 
the  mathematical  structure  of  the  autocorrelation  matrix  R,  the 
cross-correlation  vector  P,  and  the  Wiener  or  optimal  weight  vector 
W*)  imply  that  adaptive  noise  canceling  is  a  viable  method  of 
separating  weak  and  strong  signals. 

If both the intended and interfering signals are narrow-band, then
an adaptive noise canceler with a single input is the appropriate
filter structure. This is because, as shown in the "Analysis of an
Adaptive Noise Canceler with a Single Input" section, the optimal
weight vector W* will be dominated or determined by the strong
interferer. This will cause the programmable transversal filter
(PTF) to form a passband around the strong interferer, pass the
interferer, and reject the intended signal. The output of the PTF
(the filtered interfering signal) is then subtracted from the signal
plus interference at the output power combiner, which yields the
intended signal.

For separating narrow-band and random wide-band signals, the
adaptive noise canceler must be configured as an adaptive line
enhancer. As was shown in the "Analysis of an Adaptive Line
Enhancer" section, an appropriate delay before the PTF in the ALE
will cause a passband to appear (in the PTF frequency response
curve) around the narrow-band signal. Most of the random wide-band
signal will then be filtered out. The resulting narrow-band signal
will be subtracted from the sum of both signals (at the output power
combiner). The output of the combiner is the wide-band signal. In
this way signal separation is achieved.
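
The adaptive line enhancer arrangement described above can be
sketched with the LMS update of equation 81. This is a minimal
illustration under stated assumptions, not the configuration
analyzed in this report: the narrow-band signal is a single
sinusoid, the wide-band signal is white Gaussian noise, and the tap
count, delay, step size, and the function name
adaptive_line_enhancer are hypothetical.

```python
import numpy as np

def adaptive_line_enhancer(x, num_taps, delay, mu):
    """LMS-adapted transversal filter fed by a delayed copy of the input x."""
    w = np.zeros(num_taps)
    y = np.zeros(len(x))    # filter output: estimate of the narrow-band signal
    e = np.zeros(len(x))    # combiner output: estimate of the wide-band signal
    for k in range(delay + num_taps - 1, len(x)):
        # Delayed tap signals: x[k-delay], x[k-delay-1], ...
        x_k = x[k - delay - num_taps + 1:k - delay + 1][::-1]
        y[k] = x_k @ w
        e[k] = x[k] - y[k]                  # subtraction at the output combiner
        w = w + 2.0 * mu * e[k] * x_k       # equation 81
    return y, e, w

rng = np.random.default_rng(3)
k = np.arange(20000)
s = np.sin(2.0 * np.pi * 0.05 * k)      # narrow-band signal
noise = rng.standard_normal(len(k))     # random wide-band signal
x = s + noise                           # sum of both signals

y, e, w = adaptive_line_enhancer(x, num_taps=32, delay=5, mu=0.0005)

tail = slice(-5000, None)               # look only at the adapted portion
residual = np.mean((y[tail] - s[tail]) ** 2)
print(residual, np.mean(noise[tail] ** 2))
```

After adaptation, the filter output follows the sinusoid with a
residual power well below that of the wide-band signal, and the
combiner output e approximates the wide-band signal.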

The choice of an adaptive algorithm for an adaptive noise canceler
depends on several factors. If adaptation time is most important,
then Least Mean Square (LMS) should be chosen. If simplicity and
hardware costs are the driving factors, then a random search
algorithm such as the Linear Random Search (LRS) should be chosen.
If the adaptive filter is not an adaptive linear combiner or
programmable transversal filter, then the Differential Steepest
Descent (DSD) algorithm or a random search algorithm would be an
appropriate choice, since neither of these algorithms assumes a
transversal filter structure for the adaptive filter in the ALE (the
LMS algorithm does assume that the adaptive filter is a transversal
filter).
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